Feature attribution (a.k.a. input salience) methods, which assign an importance score to each feature, are abundant but may produce surprisingly different results for the same model on the same input. While differences are expected if disparate definitions of importance are assumed, most methods claim to provide faithful attributions and to point at the features most relevant to a model's prediction. Existing work on faithfulness evaluation is not conclusive and does not provide a clear answer as to how different methods should be compared. Focusing on text classification and the model debugging scenario, our main contribution is a protocol for faithfulness evaluation that makes use of partially synthetic data to obtain ground truth for feature impo...
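To make the ground-truth idea concrete, here is a minimal sketch (not taken from the cited work) of the kind of check such a protocol enables: if "shortcut" features are planted into partially synthetic data, an attribution ranking can be scored by how well its top-ranked features recover the planted ones. The `precision_at_k` helper, the scores, and the ground-truth indices are all hypothetical.

```python
# Hedged sketch: score an attribution ranking against known ground-truth
# shortcut features via precision@k. All values below are made up.
def precision_at_k(scores, ground_truth, k):
    """Fraction of the top-k attributed features that are planted shortcuts."""
    topk = sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
    return sum(1 for i in topk if i in ground_truth) / k

scores = [0.9, 0.1, 0.8, 0.05]   # hypothetical per-feature attribution scores
ground_truth = {0, 2}            # indices of planted shortcut features
print(precision_at_k(scores, ground_truth, 2))  # -> 1.0
```

A faithful method should rank the planted shortcuts highly whenever the model actually relies on them, which is what makes synthetic ground truth useful for comparing methods.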
While a lot of research in explainable AI focuses on producing effective explanations, less work is ...
Conventional saliency maps highlight input features to which neural network predictions are highly s...
Pre-trained language models (PLMs) like BERT are being used for almost all language-related tasks, b...
Pretrained transformer-based models such as BERT have demonstrated state-of-the-art predictive perfo...
Saliency methods calculate how important each input feature is to a machine learning model's predict...
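As an illustration of the per-feature importance scores these abstracts refer to, below is a hedged sketch of gradient-times-input saliency for a tiny logistic-regression "model" (a common baseline attribution, not the specific method of any one paper; the weights and input are hypothetical):

```python
# Illustrative sketch: gradient-x-input saliency for a linear + sigmoid model.
# Importance of feature i is approximated as (d prediction / d x_i) * x_i.
import numpy as np

def predict(w, b, x):
    """Sigmoid output of a linear model."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

def grad_x_input_saliency(w, b, x):
    """Elementwise gradient of the output w.r.t. x, times x."""
    p = predict(w, b, x)
    grad = p * (1.0 - p) * w   # closed-form gradient for sigmoid(w @ x + b)
    return grad * x

w = np.array([2.0, -1.0, 0.0])   # hypothetical weights
b = 0.0
x = np.array([1.0, 1.0, 5.0])    # hypothetical input
sal = grad_x_input_saliency(w, b, x)
# A zero-weight feature receives zero attribution regardless of its value.
```

Note the design choice this family of methods shares: importance is derived from local sensitivity of the prediction, which is exactly why different saliency methods can disagree on the same model and input.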
This thesis focuses on model interpretability, an area concerned with understanding model predicti...
To explain NLP models, a popular approach is to use importance measures, such as attention, which inf...
Feature attribution methods are popular in interpretable machine learning. These methods compute the...
Complex machine learning algorithms are used more and more often in critical t...
Saliency methods are a popular class of feature attribution explanation methods that aim to capture ...
Saliency methods provide post-hoc model interpretation by attributing input features to the model ou...
Saliency methods compute heat maps that highlight portions of an input that were most important...
Neural network architectures in natural language processing often use attention mechanisms to produc...
A popular approach to unveiling the black box of neural NLP models is to leverage saliency methods, ...
Leveraging the characteristics of convolutional layers, neural networks are extremely effective for ...